Using Derivation Trees for Informative Treebank Inter-Annotator Agreement Evaluation
Abstract
This paper discusses the extension of a system developed for automatic discovery of treebank annotation inconsistencies over an entire corpus to the particular case of evaluating inter-annotator agreement (IAA). The system makes for a more informative IAA evaluation than other systems because it pinpoints the inconsistencies and groups them by their structural types. We evaluate the system on two corpora: (1) a corpus of English web text and (2) a corpus of Modern British English.
Related resources
A TAG-derived Database for Treebank Search and Parser Analysis
Recent work has proposed the use of an extracted tree grammar as the basis for treebank analysis, in which queries are stated over the elementary trees, which are small chunks of syntactic structure. In this work we integrate search over the derivation tree with this approach in order to analyze differences between two sets of annotation on the same text, an important problem for parser analysi...
Annotators' Certainty and Disagreements in Coreference and Bridging Annotation in Prague Dependency Treebank
In this paper, we present the results of the parallel Czech coreference and bridging annotation in the Prague Dependency Treebank 2.0. The annotation is carried out on dependency trees (on the tectogrammatical layer). We describe the inter-annotator agreement measurement, classify and analyse the most common types of annotators’ disagreement. On two selected long texts, we asked the annotators ...
The Penn Discourse Treebank
This paper describes a new discourse-level annotation project – the Penn Discourse Treebank (PDTB) – that aims to produce a large-scale corpus in which discourse connectives are annotated, along with their arguments, thus exposing a clearly defined level of discourse structure. The PDTB is being built directly on top of the Penn Treebank and Propbank, thus supporting the extraction of useful sy...
Annotating Discourse Connectives And Their Arguments
This paper describes a new, large scale discourse-level annotation project – the Penn Discourse TreeBank (PDTB). We present an approach to annotating a level of discourse structure that is based on identifying discourse connectives and their arguments. The PDTB is being built directly on top of the Penn TreeBank and Propbank, thus supporting the extraction of useful syntactic and semantic featu...
Towards Building Parallel Dependency Treebanks: Intra-Chunk Expansion and Alignment for English Dependency Treebank
The paper presents our work on the annotation of intra-chunk dependencies on an English treebank that was previously annotated with inter-chunk dependencies, and for which there exists a fully expanded parallel Hindi dependency treebank. This provides fully parsed dependency trees for the English treebank. We also report an analysis of the inter-annotator agreement for this chunk expansion task...